Against Spurious Sparks – Dovelating Inflated AI Claims 🕊️

ECONDAT Conference 2024

Delft University of Technology

Andrew M. Demetriou
Antony Bartlett
Cynthia C. S. Liem

May 7, 2024

Motivation

  • \(A_1\): “It is essential to bring inflation back to target to avoid drifting into deflation territory.”
  • \(A_2\): “It is essential to bring the numbers of doves back to target to avoid drifting into dovelation territory.”

“They’re exactly the same.”

— Linear probe \(\widehat{\text{cpi}} = f(A)\)
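To make the joke concrete: under a plain bag-of-words embedding, the two sentences are almost the same vector, so any linear probe \(f\) maps them to almost the same \(\widehat{\text{cpi}}\). A minimal sketch (ours, not from the talk; it assumes scikit-learn is available):

```python
# Bag-of-words vectors for A1 and A2: the shared sentence template dominates,
# so the two 'opposite' statements end up nearly collinear.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

a1 = ("It is essential to bring inflation back to target "
      "to avoid drifting into deflation territory.")
a2 = ("It is essential to bring the numbers of doves back to target "
      "to avoid drifting into dovelation territory.")

A = CountVectorizer().fit_transform([a1, a2])
print(cosine_similarity(A)[0, 1])  # ~0.85: to a linear probe, "exactly the same"
```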

Position

Current LLMs embed knowledge. They don’t ‘understand’ anything. They are useful tools, but tools nonetheless.

  • Meaningful patterns in embeddings are like doves in the sky.
  • Humans are prone to seek patterns and anthropomorphize.
  • Observed ‘sparks’ of Artificial General Intelligence are spurious.
  • The academic community should exercise extra caution.
  • Publishing incentives need to be adjusted.

Outline

  • Experiments: We probe models of varying complexity, including random projections, matrix decompositions, deep autoencoders, and transformers (a minimal example follows this outline).
    • All of them successfully distill knowledge, yet none of them develops true understanding.
  • Social sciences review: Humans are prone to seek patterns and anthropomorphize.
  • Conclusion and outlook: More caution at the individual level, and different incentives at the institutional level.
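As a first taste of the experiments, consider the simplest model in the list. The sketch below (our own illustration, with arbitrary sizes) shows that a random projection already ‘distills knowledge’ in the sense that it preserves the geometry of the data, while having learned nothing at all.

```python
# Random projection: untrained, yet it approximately preserves all pairwise
# distances (the Johnson-Lindenstrauss effect), i.e. the 'knowledge' in the
# data's geometry survives without any understanding.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 1000))               # 100 items, 1000 raw features
P = rng.normal(size=(1000, 50)) / np.sqrt(50)  # random map to 50 dimensions
Z = X @ P

def pairwise_dists(M):
    diff = M[:, None, :] - M[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

i, j = np.triu_indices(100, k=1)
ratios = pairwise_dists(Z)[i, j] / pairwise_dists(X)[i, j]
print(ratios.mean(), ratios.std())             # mean ≈ 1, small spread
```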

There! It’s sentient!

The Holy Grail

Achieving Artificial General Intelligence (AGI) has become a grand challenge and, in some cases, an explicit business goal.

Definition

The definition of AGI itself is neither clear-cut nor consistent:

  • (loosely) a phenomenon contrasting with ‘narrow AI’ systems, which are trained for specific tasks (Goertzel 2014).

Practice

Researchers have sought to show that AI models generalize to different (and possibly unseen) tasks, or exhibit performance that humans consider ‘surprising’.

  • For example, Google DeepMind claimed their AlphaGeometry model (Trinh et al. 2024) reached a ‘milestone’ towards AGI.

A Perfect Storm

Recent developments in the field have created a ‘perfect storm’ for inflated claims:

  • Early sharing of preprints and code.
  • Volume of publishable work has exploded.
  • Social media influencers are starting to play a role in article discovery and citability (Weissburg et al. 2024).
  • Complexity is increasing because it is incentivized (Birhane et al. 2022).

“Not Mere Stochastic Parrots”

  • We consider a recently viral work (Gurnee and Tegmark 2023a) claiming that LLMs learn world models.
    • Linear probes (ridge regression) were successfully used to predict geographical locations from LLM embeddings (sketched below).
  • Claims on X asserted that this shows LLMs are not mere ‘stochastic parrots’ (Bender et al. 2021).
  • Reactions on X seemed to largely exhibit excitement and surprise at the authors’ findings.
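A minimal sketch of the probing mechanics (ours; in the paper the features are Llama-2 activations for place names, whereas here a synthetic matrix that linearly encodes coordinates stands in for them):

```python
# Ridge-regression probe: if coordinates are (linearly) present in the
# embeddings, the probe recovers them. This demonstrates presence of
# information, not understanding.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d = 1000, 512
coords = rng.uniform([-90, -180], [90, 180], size=(n, 2))  # lat/long targets
mix = rng.normal(size=(2, d))
X = coords @ mix + rng.normal(size=(n, d))                 # stand-in 'embeddings'

X_tr, X_te, y_tr, y_te = train_test_split(X, coords, random_state=0)
probe = Ridge(alpha=1.0).fit(X_tr, y_tr)
print(probe.score(X_te, y_te))  # high out-of-sample R^2, by construction
```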

On the Unsurprising Nature of Latent Embeddings

Are Neural Networks Born with World Models?

  • The Llama-2 model tested in Gurnee and Tegmark (2023b) has ingested huge amounts of publicly available data (Touvron et al. 2023).
    • Geographical locations are literally in the training data: e.g., the Wikipedia article for “London”.
    • Where would this information be encoded, if not in the embedding space \(\mathcal{A}\)? Is it surprising that \(A_{\text{LDN}}=\mathrm{enc}(\text{"London"}) \not\!\perp\!\!\!\perp (\text{lat}_{\text{LDN}},\text{long}_{\text{LDN}})\)?
  • Figure 1 shows the predicted coordinates of a linear probe on the final-layer activations of an untrained neural network (a sketch of this setup follows the figure).
  • The model has seen noisy coordinates plus \(d\) random features.
  • It has a single hidden layer with \(h < d\) hidden units.

Figure 1: Predicted coordinate values (out-of-sample) from a linear probe on final-layer activations of an untrained neural network.
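A sketch of this experiment (ours; the slide does not fix \(n\), \(d\), \(h\), the noise scale, or the nonlinearity, so those choices below are assumptions):

```python
# Untrained single-hidden-layer network: random weights, never fitted.
# A linear probe on its activations still 'finds' the coordinates, because
# they are present (noisily) in the input, not because the net understands.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n, d, h = 2000, 64, 32                                       # h < d hidden units
coords = rng.uniform(-1, 1, size=(n, 2))                     # 'true' coordinates
inputs = np.hstack([coords + 0.1 * rng.normal(size=(n, 2)),  # noisy coordinates
                    0.1 * rng.normal(size=(n, d))])          # d random features

W1 = rng.normal(size=(d + 2, h)) / np.sqrt(d + 2)            # random, untrained weights
acts = np.tanh(inputs @ W1)                                  # final-layer activations

A_tr, A_te, y_tr, y_te = train_test_split(acts, coords, random_state=0)
probe = Ridge(alpha=1.0).fit(A_tr, y_tr)
print(probe.score(A_te, y_te))  # well above chance, with zero training
```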

PCA as a Yield Curve Interpreter
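No slide content survives here beyond the title, so the following is our guess at the intended demo: PCA on a panel of yield curves recovers the textbook ‘level’ and ‘slope’ factors, an interpretable pattern produced by a method that understands nothing about interest rates. Synthetic curves stand in for real data.

```python
# PCA via SVD on synthetic yield curves built from a level and a slope factor.
import numpy as np

rng = np.random.default_rng(1)
maturities = np.array([0.25, 1.0, 2.0, 5.0, 10.0, 30.0])   # in years
level = rng.normal(size=(500, 1))                          # parallel shifts
slope = rng.normal(size=(500, 1))                          # steepening moves
curves = (3.0 + level + slope * (maturities / 30.0)
          + 0.05 * rng.normal(size=(500, maturities.size)))

centered = curves - curves.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
print((S ** 2 / (S ** 2).sum())[:2])  # two components explain almost everything
print(Vt[0].round(2))                 # ~flat loadings: the 'level' factor
print(Vt[1].round(2))                 # ~monotone loadings: the 'slope' factor
```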

LLMs for Economic Sentiment Prediction

Human Proneness to Over-Interpretation

Spurious Relationships
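A quick illustration of the statistical side of this phenomenon (ours, not from the slides): two completely independent random walks are routinely highly correlated in-sample, so a pattern-hungry observer has no trouble ‘finding’ a relationship.

```python
# Spurious correlation: independent random walks share no mechanism,
# yet their sample correlation is often far from zero.
import numpy as np

rng = np.random.default_rng(7)
corrs = [
    np.corrcoef(np.cumsum(rng.normal(size=200)),       # one random walk
                np.cumsum(rng.normal(size=200)))[0, 1]  # an unrelated one
    for _ in range(1000)
]
print(np.mean(np.abs(corrs) > 0.5))  # a sizeable share, despite independence
```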

Anthropomorphism

Confirmation Bias

Questions?

With thanks to my co-authors Andrew M. Demetriou, Antony Bartlett, and Cynthia C. S. Liem, and to the audience for their attention.

References

Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜.” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–23.
Birhane, Abeba, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan, and Michelle Bao. 2022. “The Values Encoded in Machine Learning Research.” In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22).
Goertzel, Ben. 2014. “Artificial General Intelligence: Concept, State of the Art, and Future Prospects.” Journal of Artificial General Intelligence 5 (1): 1.
Gurnee, Wes, and Max Tegmark. 2023a. “Language Models Represent Space and Time.” arXiv Preprint arXiv:2310.02207v1.
———. 2023b. “Language Models Represent Space and Time.” arXiv Preprint arXiv:2310.02207v2.
Touvron, Hugo, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, et al. 2023. “LLaMA: Open and Efficient Foundation Language Models.” https://arxiv.org/abs/2302.13971.
Trinh, Trieu H., Yuhuai Wu, Quoc V. Le, He He, and Thang Luong. 2024. “Solving Olympiad Geometry Without Human Demonstrations.” Nature 625: 476–82. https://doi.org/10.1038/s41586-023-06747-5.
Weissburg, Iain Xie, Mehir Arora, Liangming Pan, and William Yang Wang. 2024. “Tweets to Citations: Unveiling the Impact of Social Media Influencers on AI Research Visibility.” arXiv Preprint arXiv:2401.13782.